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Abstract 


Compressive  Sensing  (CS)  combines  sampling  and  compression  into  a  single  sub- 
Nyquist  linear  measurement  process  for  sparse  and  compressible  signals.  In  this 
paper,  we  extend  the  theory  of  CS  to  include  signals  that  are  concisely  repre¬ 
sented  in  terms  of  a  graphical  model.  In  particular,  we  use  Markov  Random  Fields 
(MRFs)  to  represent  sparse  signals  whose  nonzero  coefficients  are  clustered.  Our 
new  model-based  recovery  algorithm,  dubbed  Lattice  Matching  Pursuit  (LaMP), 
stably  recovers  MRF-modeled  signals  using  many  fewer  measurements  and  com¬ 
putations  than  the  current  state-of-the-art  algorithms. 

1  Introduction 

The  Shannon/Nyquist  sampling  theorem  tells  us  that  in  order  to  preserve  information  when  uni¬ 
formly  sampling  a  signal  we  must  sample  at  least  two  times  faster  than  its  bandwidth.  In  many 
important  and  emerging  applications,  the  resulting  Nyquist  rate  can  be  so  high  that  we  end  up  with 
too  many  samples  and  must  compress  in  order  to  store  or  transmit  them.  In  other  applications,  in¬ 
cluding  imaging  systems  and  high-speed  analog-to-digital  converters,  increasing  the  sampling  rate 
or  density  beyond  the  current  state-of-the-art  is  very  expensive.  A  transform  compression  system 
reduces  the  effective  dimensionality  of  an  A/'-dimensional  signal  by  re-representing  it  in  terms  of  a 
sparse  expansion  in  some  basis  (for  example,  the  discrete  cosine  transform  for  JPEG).  By  sparse  we 
mean  that  only  K  N  of  the  basis  coefficients  are  nonzero. 

The  new  theory  of  compressive  sensing  (CS)  combines  sampling  and  compression  into  a  single  sub- 
Nyquist  linear  measurement  process  for  sparse  signals  [1, 2].  In  CS,  we  measure  not  periodic  signal 
samples  but  rather  inner  products  with  M  <  N  known  measurement  vectors;  random  measurement 
vectors  play  a  starring  role.  We  then  recover  the  signal  by  searching  for  the  sparsest  signal  that 
agrees  with  the  measurements.  Research  in  CS  to  date  has  focused  on  reducing  both  the  number 
of  measurements  M  (as  a  function  of  N  and  K)  and  on  reducing  the  computational  complexity  of 
the  recovery  algorithm.  Today’s  state-of-the-art  CS  systems  can  recover  iT-sparse  and  more  general 
compressible  signals  using  M  =  0{K\og{N/K))  measurements  using  polynomial-time  linear 
programming  or  greedy  algorithms. 

While  such  sub-Nyquist  measurement  rates  are  impressive,  our  contention  in  this  paper  is  that  for 
CS  to  truly  live  up  its  name  it  must  more  fully  leverage  concepts  from  state-of-the-art  compression 
algorithms.  In  virtually  all  such  algorithms,  the  key  ingredient  is  a  signal  model  that  goes  beyond 
simple  sparsity  by  providing  a  model  for  the  basis  coefficient  structure.  For  instance,  JPEG  does  not 
only  use  the  fact  that  most  of  the  DCT  of  a  natural  image  are  small.  Rather,  it  also  exploits  the  fact 
that  the  values  and  locations  of  the  large  coefficients  have  a  particular  structure  that  is  characteristic 
of  natural  images.  Coding  this  structure  using  an  appropriate  model  enables  JPEG  and  other  similar 
algorithms  to  compress  images  close  to  the  maximum  amount  possible,  and  significantly  better  than 
a  naive  coder  that  just  assigns  bits  to  each  large  coefficient  independently. 
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In  this  paper,  we  extend  the  theory  of  CS  to  include  signals  that  are  concisely  represented  in  terms 
of  a  graphical  model  [3].  We  use  Markov  Random  Fields  (MRFs)  to  represent  sparse  signals  whose 
nonzero  coefficients  also  cluster  together.  Our  new  model-based  recovery  algorithm,  dubbed  Lattice 
Matching  Pursuit  (LaMP),  performs  rapid  and  numerically  stable  recovery  of  MRF-modeled  signals 
using  far  fewer  measurements  than  standard  algorithms. 

The  organization  of  the  paper  is  as  follows.  In  Sections  2  and  3,  we  briefly  review  the  CS  and  MRF 
theories.  We  develop  LaMP  in  Section  4  and  present  experimental  results  in  Section  5  using  both 
simulated  and  real  world  data.  We  conclude  by  offering  our  perspective  on  the  future  directions  of 
model-based  CS  research  in  Section  6. 

2  Compressive  sensing:  From  sparsity  to  structured  sparsity 

Sparse  signal  recovery.  Any  signal  x  G  can  be  represented  in  terms  of  N  coefficients  {ai} 
in  a  basis  stacking  the  as  columns  into  the  matrix  ^nxN,  we  can  write  succinctly 

that  X  =  ^6.  We  say  that  x  has  a  sparse  representation  if  only  K  N  entries  of  6  are  nonzero, 
and  we  denote  by  LIk  the  set  of  (^)  possible  supports  for  such  iT-sparse  signals.  We  say  that  x  is 
compressible  if  the  sorted  magnitudes  of  the  entries  of  6  decay  rapidly  enough  that  it  can  be  well 
approximated  as  AT-sparse. 

In  Compressive  Sensing  (CS),  the  signal  is  not  acquired  by  measuring  x  or  cx  directly.  Rather,  we 
measure  the  M  <  N  linear  projections  y  =  ^x  =  ^^6  using  the  M  x  N  matrix  In  the 
sequel,  without  loss  of  generality,  we  focus  on  two-dimensional  image  data  and  assume  that  ^  =  / 
(the  N  X  N  identity  matrix)  so  that  x  =  6.  The  most  commonly  used  criterion  for  evaluating  the 
quality  of  a  CS  measurement  matrix  is  the  restricted  isometry  property  (RIP).  A  matrix  ^  satisfies 
the  AT-RIP  if  there  exists  a  constant  (5^  >  0  such  that  for  all  AT-sparse  vectors  x, 

(1  -  Sk)\\x\\2  <  ||^»||2  <  (1  +(5if)||®||2.  (1) 

The  recovery  of  the  set  of  significant  coefficients  Oi  is  achieved  using  optimization:  we  search  for 
the  sparsest  6  that  agrees  with  the  measurements  y.  While  in  principle  recovery  is  possible  using  a 
matrix  that  has  the  2Ar-RIP  with  d2K  <  1,  such  an  optimization  is  combinatorially  complex  (NP- 
complete)  and  numerically  unstable.  If  we  instead  use  a  matrix  that  has  the  3Ff-RIP  with  <1/2, 
then  numerically  stable  recovery  is  possible  in  polynomial  time  using  either  a  linear  program  [1,2] 
or  a  greedy  algorithm  [4].  Intriguingly,  a  random  Gaussian  or  Bernoulli  matrix  works  with  high 
probability,  leading  to  a  randomized  acquisition  protocol  instead  of  uniform  sampling. 

Structured  sparsity.  While  many  natural  and  manmade  signals  and  images  can  be  described  to  the 
first-order  as  sparse  or  compressible,  their  sparse  supports  (set  of  nonzero  coefficients)  often  have  an 
underlying  order.  This  order  plays  a  central  role  in  the  transform  compression  literature,  but  it  has 
barely  been  explored  in  the  CS  context  [5, 6].  The  theme  of  this  paper  is  that  by  exploiting  a  priori 
information  on  coefficient  structure  in  addition  to  signal  sparsity,  we  can  make  CS  better,  stronger, 
and  faster. 

Figure  1  illustrates  a  real-world  example  of  structured  sparse  support  in  a  computer  vision  applica¬ 
tion.  Figure  1(b)  is  a  background  subtracted  image  computed  from  a  video  sequence  of  a  parking 
lot  with  two  moving  people  (one  image  frame  is  shown  in  Figure  1(a)).  The  moving  people  form 
the  foreground  (white  in  (b)),  while  the  rest  of  the  scene  forms  the  background  (black  in  (b)).  The 
background  subtraction  was  computed  from  CS  measurements  of  the  video  sequence.  Background 
subtracted  images  play  a  fundamental  role  in  making  inferences  about  objects  and  activities  in  a 
scene  and,  by  nature,  they  have  structured  spatial  sparsity  corresponding  to  the  foreground  innova¬ 
tions.  In  other  words,  compared  to  the  scale  of  the  scene,  the  foreground  innovations  are  usually  not 
only  sparse  but  also  clustered  in  a  distinct  way,  e.g.,  corresponding  to  the  silhouettes  of  humans  and 
vehicles.  Nevertheless,  this  clustering  property  is  not  exploited  by  current  CS  recovery  algorithms. 

Probabilistic  RIP.  The  RIP  treats  all  possible  Ff-sparse  supports  equally.  However,  if  we  incor¬ 
porate  a  probabilistic  model  on  our  signal  supports  and  consider  only  the  signal  supports  with  the 
highest  likelihoods,  then  we  can  potentially  do  much  better  in  terms  of  the  required  number  of 
measurements  required  for  stable  recovery. 

We  say  that  ^  satisfies  the  (A",  e) -probabilistic  RIP  (PRIP)  if  there  exists  a  constant  >  0  such 
that  for  a  AT-sparse  signal  x  generated  by  a  specified  probabilistic  signal  model,  (1)  holds  with 
probability  at  least  1  —  e  over  the  signal  probability  space.  We  propose  a  preliminary  result  on  the 
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(a)  (b)  (c) 

Figure  1:  A  camera  surveillance  image  (b)  with  the  background  subtracted  image  (b)  recovered  using  com¬ 
pressive  measurements  of  the  scene.  The  background  subtracted  image  has  resolution  N  =  240  x  320  and 
sparsity  K  =  390.  (c)  A  random  K  —  390  sparse  image  in  N  =  240  x  320  dimensions.  The  probability  of 
image  (b)  under  the  Ising  model  is  approximately  10®^®  times  greater  than  the  probability  of  image  (c). 


number  of  random  measurements  needed  under  this  new  criterion;  this  is  a  direct  consequence  of 
Theorem  5.2  of  [7].  (See  also  [8]  for  related  results.) 

Lemma  1.  Suppose  that  M,  N,  and  S  G  [0,1]  are  given  and  that  the  signal  x  is  generated  by 
a  known  probabilistic  model.  Let  VtK.e  ^  denote  the  smallest  set  of  supports  for  which  the 
probability  that  a  K -sparse  signal  x  has  supp{x)  ^  Is  less  than  e,  and  denote  D  =  \LlK,el 
If  ^  is  a  matrix  with  normalized  i.i.d.  Gaussian  or  Bernoulli/Rademacher  (±1)  random  entries, 
then  $  has  the  €)-PRIP  with  probability  at  least  1  —  if  M  >  ci{K  +  \og{D)),  where 

Cl ,  C2  >0  depend  only  on  the  PRIP  constant  5k- 

To  illustrate  the  significance  of  the  above  lemma,  consider  the  following  probabilistic  model  for 
an  A/'-dimensional,  iT-sparse  signal.  We  assume  that  the  locations  of  the  non-zeros  follow  a  ho¬ 
mogeneous  Poisson  process  with  rate  A  =  —\og{e/K)N~^,  where  a  1.  Thus,  a  particular 
non-zero  coefficient  occurs  within  a  distance  of  of  its  predecessor  with  probability  1  —  e/K.  We 
determine  the  size  of  the  likely  -sparse  support  set  LIk  under  this  particular  signal  model  using 
a  simple  counting  argument.  The  location  of  the  first  non-zero  coefficients  is  among  the  first 
indices  with  probability  1  —  e/K.  After  fixing  the  location  of  the  first  coefficient,  the  location  of 
the  second  coefficient  is  among  the  next  indices  immediately  following  the  first  location  with 
probability  1  —  e/K.  Proceeding  this  way,  after  the  locations  of  the  first  j  —  1  coefficients,  have  been 
fixed,  we  have  that  the  non-zero  coefficient  is  among  candidate  locations  with  probability 
1  —  e/K.  In  this  way,  we  obtain  a  set  of  supports  LIk,€  of  size  that  will  occur  with  probability 
{1  —  e/K)^  >  1  —  e.  Thus  for  the  (AT,  e)-PRIP  to  hold  for  a  random  matrix,  the  matrix  must  have 
M  =  cK{l  -|-  alogN)  rows,  as  compared  to  the  cKlog{N/K)  rows  required  for  the  standard 
iT-RIP  to  hold.  When  a  is  on  the  order  of  (log  the  number  of  measurements  required  and  the 

complexity  of  the  solution  method  grow  essentially  linearly  in  K,  which  is  a  considerable  improve¬ 
ment  over  the  best  possible  M  =  0{K  \og{N/ K))  measurements  required  without  such  a  priori 
information. 

3  Graphical  models  for  compressive  sensing 

Clustering  of  the  nonzero  coefficients  in  a  sparse  signal  representation  can  be  realistically  captured 
by  a  probabilistic  graphical  model  such  as  a  Markov  random  field  (MRF);  in  this  paper  we  will 
focus  for  concreteness  on  the  classical  Ising  model  [9] . 

Support  model.  We  begin  with  an  Ising  model  for  the  signal  support.  Suppose  we  have  a  7f-sparse 
signal  X  G  whose  support  is  represented  bys  G  {  —  1,1}^  such  that  Si  =  —1  when  Xi  =  D  and 
Si  =  1  when  7^  0.  The  probability  density  function  (PDF)  of  the  signal  support  can  be  modeled 
using  a  graph  Gg  =  (Vg,  Eg),  where  Vg  =  {1, . . . ,  A^}  denotes  a  set  of  N  vertices  -  one  for  each 
of  the  support  indices  -  and  Eg  denotes  the  set  of  edges  connecting  support  indices  that  are  spatial 
neighbors  (see  Figure  2(a)).  The  contribution  of  the  interaction  between  two  elements  {5^,  Sj}  in 
the  support  of  x  is  controlled  by  the  coefficient  Xij  >  0.  The  contribution  of  each  element  Si  is 
controlled  by  a  coefficient  A^,  resulting  in  the  following  PDF  for  the  sparse  support  s\ 

p(s;  A)  =  exp  <  ^  XijSiSj  +  XiSi  -  Zs{X)  >  ,  (2) 

where  Z5  ( A)  is  a  strictly  convex  partition  function  with  respect  to  A  that  normalizes  the  distribution 
so  that  it  integrates  to  one.  The  parameter  vector  A  quantifies  our  prior  knowledge  regarding  the 
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Figure  2:  Example  graphical  models:  (a)  Ising  model  for  the  support,  (b)  Markov  random  field  model  for  the 
resulting  coefficients,  (c)  Markov  random  field  with  CS  measurements. 

signal  support  s  and  consists  of  the  edge  interaction  parameters  Xij  and  the  vertex  bias  parameters 
Xi.  These  parameters  can  be  learned  from  data  using  -minimization  techniques  [10]. 

The  Ising  model  enforces  coefficient  clustering.  For  example,  compare  the  clustered  sparsity  of  the 
real  background  subtracted  image  in  Figure  1(b)  with  the  dispersed  “independent”  sparsity  of  the 
random  image  in  Figure  1(c).  While  both  images  (b)  and  (c)  are  equally  sparse,  under  a  trained  Ising 
model  (Xij  =  0.45  and  =  0),  the  image  (b)  is  approximately  10^^^  times  more  likely  than  the 
image  (c). 

Signal  model.  Without  loss  of  generality,  we  focus  on  2D  images  that  are  sparse  in  the  space 
domain,  as  in  Figure  1(b).  Leveraging  the  Ising  support  model  from  above,  we  apply  the  MRF 
graphical  model  in  Figure  2(b)  for  the  pixel  coefficient  values.  Under  this  model,  the  support  is 
controlled  by  an  Ising  model,  and  the  signal  values  are  independent  given  the  support.  We  now 
develop  a  joint  PDF  for  the  image  pixel  values  x,  the  support  labels  s,  and  the  CS  measurements  y. 

We  begin  with  the  support  PDF  p{s)  from  (2)  and  assume  that  we  are  equipped  with  a  sparsity- 
promoting  PDF p(x\s)  for  X  given  s.  The  most  commonly  used  PDF  is  the  Laplacian  density  (which 
is  related  to  the  ^i-norm  of  x);  however,  other  reference  priors,  such  as  generalized  Gaussians  that 
are  related  to  the  ^^-norm  of  x,p  <  1,  can  be  more  effective  [11].  We  assume  that  the  measurements 
y  are  corrupted  by  i.i.d.  Gaussian  noise,  i.e.,  p(y\x)  =  J\f  J),  where  is  the  unknown 

noise  variance. 

From  Figure  2(c),  it  is  easy  to  show  that,  given  the  signal  x,  the  signal  support  s  and  the  compressive 
measurements  y  are  independent  using  the  D-separation  property  of  graphs  [12].  Hence,  the  joint 
distribution  of  the  vertices  in  the  graph  in  Figure  2(b)  can  be  written  as 

p{z)  =  p{s,  X,  y)  =  p{s,  x)p{y\s,  x)  =  p{s)p{x\s)p{y\x) ,  (3) 

where  a  =  x'^,  Then,  (3)  can  he  explicitly  written  as 

p(2)(xexp'|  ^  AijSiSj  +  ^  [AiSi +  log(p(xi|si))]  -  ^  ||y  -  ^a;||2  >  .  (4) 

4  Lattice  matching  pursuit 

Using  the  coefficient  graphical  model  from  Section  3,  we  are  now  equipped  to  develop  a  new  model- 
based  CS  signal  recovery  algorithm.  Lattice  Matching  Pursuit  (LaMP)  is  a  greedy  algorithm  for 
signals  on  2D  lattices  (images)  in  which  the  likelihood  of  the  signal  support  is  iteratively  evaluated 
and  optimized  under  an  Ising  model.  By  enforcing  a  graphical  model,  (/)  partial  knowledge  of 
the  sparse  signal  support  greatly  decreases  the  ambiguity  and  thus  size  of  the  search  space  for  the 
remaining  unknown  part,  accelerating  the  speed  of  the  algorithm;  and  (ii)  signal  supports  of  the  same 
size  but  different  structures  result  in  different  likelihoods  (recall  Figure  1(b)  and  (c)),  decreasing  the 
required  number  of  CS  measurements  and  increasing  the  numerical  stability. 

Algorithm.  The  LaMP  pseudocode  is  given  in  Algorithm  1 .  Similar  to  other  greedy  recovery  al¬ 
gorithms  such  as  matching  pursuit  and  CoSaMP  [4],  each  iteration  of  LaMP  starts  by  estimating  a 
data  residual  given  the  current  estimate  of  the  signal  (Step  1).  After  calculating  the 

residual,  LaMP  calculates  a  temporary  signal  estimate  (Step  2)  denoted  by  xl^\  This  signal  esti¬ 
mate  is  the  sum  of  the  previous  estimate  and  accounting  for  the  current  residual. 

Using  this  temporary  signal  estimate  as  a  starting  point,  LaMP  then  maximizes  the  likelihood  (4) 
over  the  support  via  optimization  (Step  3).  This  can  be  efficiently  solved  using  graph  cuts  with 
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Algorithm  1:  LaMP  -  Lattice  Matching  Pursuit 
Input:  y,  =  0,  =  —  1,  and  K  (desired  sparsity). 

Output:  A  i^-sparse  approximation  x  of  the  acquired  signal. 

Algorithm: 

repeat  {Matching  Pursuit  Iterations} 

Step  1.  Calculate  data  residual: 

Step  2.  Propose  a  temporary  target  signal  estimate: 


X 


Step  3.  Determine  MAP  estimate  of  the  support  using  graph  cuts: 


^{i,j)eEs 

Step  4.  Estimate  target  signal: 

t  =  0;  =  1]  =  =  l]y; 

Step  5.  Iterate: 

/c  =  /c  +  1; 


XijSiSj  +  'Eiev,  +  log(p([a:f 


x{k}  =  Prunejt;  if}; 


until  Maximum  iterations  or 
Return  x  =  x^^^ . 


II  <  threshold; 


p{xi\si  =  -1)/! 


K 


p{Xi\Si  =  +1)1 


>  \ogp(xi\si  =  -1) 

e2  - 

es  -  ^ 


Ll 


\ogp{xi\si  =  +1) 


w. 


logei 


log  p(Xi\Si  =  - 

-  log  ei 


.  log  62 
■log  63 


logp(x^|s^  =  +  l) 
logei 


lA 

¥: 


U_i(xi;r) 


U+i(xi;T) 


Figure  3:  Geometrical  approximations  of  p{xi\si  =  —1)  and  log p{xi\si  =  +1). 


0{N)  complexity  [13].  In  particular,  for  planar  Ising  models,  the  global  minimum  of  the  problem 
can  be  obtained.  Once  a  likely  signal  support  is  obtained  in  Step  3,  LaMP  obtains  an  up¬ 
dated  signal  estimate  x^^^  using  least  squares  with  the  selected  columns  of  the  measurement  matrix 
$[:,  =  1]  and  pruning  back  to  the  largest  K  signal  coefficients  (Step  4).  Hence,  the  parameter 

K  controls  the  sparsity  of  the  approximation.  In  Step  4,  a  conjugate  gradient  method  is  used  for 
efficiently  performing  the  product  by  a  pseudoinverse.  If  the  graphical  model  includes  dependencies 
between  the  signal  values  Xi,  we  then  replace  the  pseudoinverse  product  by  a  belief  propagation 
algorithm  to  efficiently  solve  for  the  signal  values  x^^^  within  Step  4. 

Signal  log-likelihood  logp(x|s).  The  correct  signal  PDF  to  use  given  the  support  p{x\s)  is 
problem-dependent.  Here,  we  provide  one  approximation  that  mimics  the  £o  minimization  for  CS 
recovery  for  the  signal  graphical  model  in  Figure  2(c);  we  also  use  this  in  our  experiments  in  Sec¬ 
tion  5.  The  state  Si  =  1  represents  a  nonzero  coefficient;  thus,  all  nonzero  values  of  Xi  should 
have  equal  probability,  and  the  value  Xi  =  0  should  have  zero  probability.  Similarly,  the  state 
Si  =  —1  represents  a  zero-valued  coefficient;  thus,  the  mass  of  its  probability  function  is  concen¬ 
trated  at  zero.  Hence,  we  use  the  approximations  for  Xi  G  [— L,  L],  a  restricted  dynamic  range: 
p{xi\si  =  —1)  =  6{xi)  Sind  p{xi\si  =  !)  =  (!  —  S{xi))/2L.  However,  the  optimization  over 
the  joint  PDF  in  (4)  requires  a  “smoothing”  of  these  PDFs  for  two  reasons:  (/)  to  obtain  robustness 
against  noise  and  numerical  issues;  and  (ii)  to  extend  the  usage  of  the  algorithm  from  sparse  to 
compressible  signals. 

We  approximate  log p{xi\si  =  ±1)  using  the  parametric  form  illustrated  in  Figure  3.  Here,  the 
constant  r  is  a  slack  parameter  to  separate  large  and  small  signal  coefficients,  and  61+2,  and  63  are 
chosen  according  to  r  and  L  to  normalize  each  PDF.  We  also  denote  a  =  esL,  with  a  ^  1.  Using 
the  normalization  constraints,  it  is  possible  to  show  that  as  the  dynamic  range  increases. 


lim 

L^oo 


log  €2  ^  J_ 

log  ei  ra 


and 


lim 

L^oo 


log  63 

log  61 


0. 
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Hence,  we  approximate  the  likelihoods  using  the  utility  functions  Ug .  {x;  r)  that  follow  this  form. 
The  optimization  problem  used  by  Step  3  of  LaMP  to  determine  the  support  is  then  approximately 
equivalent  to  the  following  problem 


max 

S€{-1,+1}~ 


E 

{i,j)eEs 


X-ij  SiSj 


ieVs 


(5) 


where  A  =  If  the  signal  values  are  known  to  be  positive,  then  the  definitions  of  Ug.  can 

be  changed  to  enforce  the  positivity  during  estimation.  The  choice  of  Xij  is  related  to  the  desired 
sparseness  on  the  lattice  structure. 


To  enforce  a  desired  sparsity  K  on  the  lattice  structure,  we  apply  statistical  mechanics  results  on 
the  2D  Ising  model  and  choose  Xij  =  0.5arcsin((l  —  where  m  is  called  the  average 

magnetization.  In  our  recovery  problem,  the  average  magnetization  and  the  desired  signal  sparsity 

has  a  simple  relationship:  m  =  (+1)  x  K  ^  (—1)  x  {N  —  K)  /N.  We  set  =  0  unless  there 
is  prior  information  on  the  signal  support.  The  threshold  r  is  chosen  at  each  iteration  adaptively  by 
sorting  the  magnitudes  of  the  temporary  target  signal  estimate  coefficients  and  determining  the  bK 
threshold;  this  gives  preference  to  the  largest  bK  coefficients  that  attain  states  Si  =  1,  unless  the 
cost  incurred  by  enforcing  the  lattice  structure  is  too  large.  The  pruning  operation  in  Step  4  of  LaMP 
then  enforces  the  desired  sparsity  K. 


5  Experiments 

We  now  use  several  numerical  simulations  to  demonstrate  that  for  spatially  clustered  sparse  signals, 
which  have  high  likelihood  under  our  MRF  model,  LaMP  requires  far  fewer  measurements  and 
fewer  computations  for  robust  signal  recovery  than  state-of-the-art  greedy  and  optimization  tech¬ 
niques.^ 

Experiment  1:  Shepp-Logan  phantom.  Figure  4  (top  left)  shows  the  classical  N  =  100  x 
100  Shepp-Logan  phantom  image.  Its  sparsity  in  the  space  domain  is  K  =  1740.  We  obtained 
compressive  measurements  of  this  image,  which  were  then  immersed  in  additive  white  Gaussian 
noise  to  an  SNR  of  lOdB.  The  top  row  of  Figure  4  illustrates  the  iterative  image  estimates  obtained 
using  LaMP  from  just  M  =  2K  =  3480  random  Gaussian  measurements  of  the  noisy  target. 
Within  3  iterations,  the  support  of  the  image  is  accurately  determined;  convergence  occurs  at  the  5th 
iteration. 


Figure  4  (bottom)  compares  LaMP  to  CoSaMP  [4],  a  state-of-the-art  greedy  recovery  algorithm,  and 
fixed-point  continuation  (FPC)  [16],  a  state-of-the-art  ^i-norm  minimization  recovery  algorithm  us¬ 
ing  the  same  set  of  measurements.  Despite  the  presence  of  high  noise  (lOdB  SNR),  LaMP  perfectly 
recovers  the  signal  support  from  only  a  small  number  of  measurements.  It  also  outperforms  both 
CoSaMP  and  FPC  in  terms  of  speed. 

Experiment  2:  Numerical  stability.  We  demonstrate  LaMP’s  stability  in  the  face  of  substantial 
measurement  noise.  We  tested  both  LaMP  and  FPC  with  a  number  of  measurements  that  gave  close 
to  perfect  recovery  of  the  Shepp-Logan  phantom  in  the  presence  of  a  small  amount  of  noise;  for 
LaMP,  setting  M  =  1.7 K  suffices,  while  FPC  requires  M  =  AK.  We  then  studied  the  degradation 
of  the  recovery  quality  as  a  function  of  the  noise  level  for  both  algorithms.  For  reference,  a  value 
of  cr  =  20  corresponds  to  a  measurement- to-noise  ratio  of  just  6dB.  The  results  in  Figure  5(a) 
demonstrate  that  LaMP  is  stable  for  a  wide  range  of  measurement  noise  levels.  Indeed,  the  rate  of 
increase  of  the  LaMP  recovery  error  as  a  function  of  the  noise  variance  a  (a  measure  of  the  stability 
to  noise)  is  comparable  to  that  of  FPC,  while  using  far  fewer  measurements. 

Experiment  3:  Performance  on  real  background  subtracted  images.  We  test  the  recovery 
algorithms  over  a  set  of  background  subtraction  images.  The  images  were  obtained  from  a  test 
video  sequence,  one  image  frame  of  which  is  shown  in  Figure  1 ,  by  choosing  at  random  two  frames 
from  the  video  and  subtracting  them  in  a  pixel- wise  fashion.  The  large- valued  pixels  in  the  resulting 
images  are  spatially  clustered  and  thus  are  well-modeled  by  the  MRF  enforced  by  LaMP.  We  created 
100  different  test  images;  for  each  image,  we  define  the  sparsity  K  as  the  number  of  coefficients 

^We  use  the  GC Optimization  package  [13-15]  to  solve  the  support  recovery  problem  in  Step  3  in  Algorithm 
1  in  our  implementation  of  LaMP. 
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Noise-free  target 


LaMP  Iter.  #4 


LaMP  Iter.  #5,  0.9s  CoSaMP,  6.2s  FPC,  6.5s 
Figure  4:  Top:  LaMP  recovery  of  the  Shepp-Logan  phantom  (N  =  100  x  100,  K  =  1740,  SNR  =  lOdB) 
from  M  =  2K  —  3480  noisy  measurements.  Bottom:  Recoveries  from  LaMP,  CoSaMP,  and  FPC,  including 
running  times  on  the  same  computer. 


(a) 


Figure  5:  Performance  of  LaMP.  (a)  Maximum  recovery  error  over  1000  noise  iterations  as  a  function  of  the 
input  noise  variance.  LaMP  has  the  same  robustness  to  noise  as  the  FPC  algorithm,  (b)  Performance  over 
background  subtraction  dataset  of  100  images.  LaMP  achieves  the  best  performance  at  M  ^  2.bK,  while  both 
FPC  and  CoSaMP  require  M  >  bK  to  achieve  the  same  performance. 


that  contain  97%  of  the  image  energy.  We  then  performed  recovery  of  the  image  using  the  LaMP, 
CoSaMP,  and  FPC  algorithms  under  varying  number  of  measurements  M,  from  O.bK  to  bK.  An 
example  recovery  is  shown  in  Figure  6. 

For  each  test  and  algorithm,  we  measured  the  magnitude  of  the  estimation  error  normalized  by  the 
magnitude  of  the  original  image.  Figure  5(b)  shows  the  mean  and  standard  deviations  for  the  nor¬ 
malized  error  magnitudes  of  the  three  algorithms.  LaMP’s  graphical  model  reduces  the  number  of 
measurements  necessary  for  acceptable  recovery  quality  to  M  ^  2.bK,  while  the  standard  algo¬ 
rithms  require  M  >  bK  measurements  to  achieve  the  same  quality. 

6  Conclusions 

We  have  presented  an  initial  study  of  model-based  CS  signal  recovery  using  an  MRF  model  to  cap¬ 
ture  the  structure  of  the  signal’s  sparse  coefficients.  As  demonstrated  in  our  numerical  simulations, 
for  signals  conforming  to  our  model,  the  resulting  LaMP  algorithm  requires  significantly  fewer  CS 
measurements,  has  lower  computational  complexity,  and  has  equivalent  numerical  stability  to  the 
current  state-of-the-art  algorithms.  We  view  this  as  an  initial  step  toward  harnessing  the  power  of 
modern  compression  and  data  modeling  methods  for  CS  reconstruction. 

Much  work  needs  to  be  done,  however.  We  are  working  to  precisely  quantify  the  reduction  in  the 
required  number  of  measurements  (our  numerical  experiments  suggest  that  M  =  0{K)  is  sufficient 
for  stable  recovery)  and  computations.  We  also  assert  that  probabilistic  signal  models  hold  the  key 
to  formulating  inference  problems  in  the  compressive  measurement  domain  since  in  many  signal 
processing  applications,  signals  are  acquired  merely  for  the  purpose  of  making  an  inference  such  as 
a  detection  or  classification  decision. 
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